94 research outputs found

Combining Technical Trading Rules Using Parallel Particle Swarm Optimization based on Hadoop

Author: Cheung DWL
Wang F
Yu PLH
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Technical trading rules have been used in stock markets to make a profit for more than a century. However, no single trading rule can be expected to predict the stock price trend accurately. In fact, many investors and fund managers make trading decisions by combining several technical indicators. In this paper, we consider the complex stock trading strategy called Performance-based Reward Strategy (PRS), proposed by [1]. Instead of combining two classes of technical trading rules, we expand the scope to combine the seven most popular classes of trading rules in financial markets, resulting in a total of 1059 component trading rules. Each component rule is assigned a starting weight, and a reward/penalty mechanism based on the rules' recent profit is proposed to update their weights over time. To determine the best parameter values of PRS, we employ an improved time variant particle swarm optimization (TVPSO) algorithm with the objective of maximizing the annual net profit generated by PRS. Because of the large number of component rules and the swarm size, the optimization time is significant. A parallel PSO based on Hadoop, an open source implementation of the MapReduce parallel programming model, is employed to optimize PRS more efficiently. The experimental results show that PRS outperforms all of the component rules in the testing period.

Crossref

HKU Scholars Hub
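
The abstract above mentions a time variant PSO that maximizes annual net profit. As a rough illustration only, the sketch below assumes that "time variant" means an inertia weight that decreases linearly over the iterations; the paper's improved variant and its Hadoop parallelization may differ. The function names, parameter defaults, and the toy objective `toy_profit` are hypothetical stand-ins, not taken from the paper.

```python
import random


def tv_pso(objective, dim, bounds, n_particles=20, n_iters=100,
           w_start=0.9, w_end=0.4, c1=1.5, c2=1.5, seed=0):
    """Maximize `objective` with a PSO whose inertia weight decays linearly.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iters):
        # Time-varying inertia: explore early (large w), exploit late (small w).
        w = w_start - (w_start - w_end) * t / max(1, n_iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep each coordinate inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_profit(x):
    """Hypothetical stand-in for 'annual net profit'; peaks at x = (1, ..., 1)."""
    return -sum((xi - 1.0) ** 2 for xi in x)


best_params, best_value = tv_pso(toy_profit, dim=2, bounds=(-5.0, 5.0))
```

In a real setting the objective would run a back-test of the strategy for each candidate parameter vector, which is the expensive step that motivates evaluating particles in parallel.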

Medical images retrieval by color content

Author: Cheung DWL
Fu A
Ng V
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995

Conference Theme: Intelligent Systems for the 21st Century

We develop an indexing scheme for medical images. In general, for a given medical image, there is one object which is clinically important amongst the rest. We name this object the dominant object. Our proposed index is composed of three parts: (1) the dominant objects in images are standardized; (2) each image has its own quadtree, whose construction depends on the color composition and the color variation in the image; and (3) an R-tree that supports the retrieval of color images. To demonstrate the effectiveness of the index, we use images of skin lesions as the image data. Our initial experiments give promising results for image retrieval.

HKU Scholars Hub

A semantic similarity approach to electronic document modeling and integration

Author: Cheung DWL
Song WW
Tan CJ
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2000

The World Wide Web is an enormous collection of information resources serving various purposes. However, the diversity of Web information, as well as of its related formats, makes it very difficult for users to efficiently search for and obtain the information they require. The difficulty arises because most of the information uploaded to the Web is unstructured or semi-structured. Many meta-data models have been proposed to respond to this problem. These models attempt to provide a certain kind of general description for Web information in order to improve its structuredness. Although ill-structured Web documents make up the largest portion of Web information or Web resources, few meta-data models deal with them by analyzing their semantic relations with each other. In this paper, we consider this huge set of Web information, called electronic documents. We propose a meta-data model called the EDM (Electronic Document Metadata) model. Using this model, we can extract the semantic characteristics from electronic documents and then use these characteristics to form a semantic electronic document model. This model, in turn, provides a basis for the analysis of semantic similarity between electronic documents and for electronic document integration. This document modeling and integration supports further manipulations on the electronic documents, such as document exchange, searching and evolution.

CiteSeerX

HKU Scholars Hub

Complex stock trading strategy based on particle swarm optimization

Author: Cheung DWL
Wang F
Yu PLH
Publication date: 01/01/2012

Technical Session 1B - Advanced Algorithmic Trading – I: no. 41

Trading rules have been used in the stock market to make a profit for more than a century. However, using only a single trading rule may not be sufficient to predict the stock price trend accurately. Although some complex trading strategies combining various classes of trading rules have been proposed in the literature, they often pick only one rule for each class, which may lose valuable information from other rules in the same class. In this paper, a complex stock trading strategy, namely weight reward strategy (WRS), is proposed. WRS combines the two most popular classes of trading rules: moving average (MA) and trading range break-out (TRB). For both MA and TRB, WRS includes different combinations of the rule parameters to get a universe of 140 component trading rules in all. Each component rule is assigned a starting weight, and a reward/penalty mechanism based on profit is proposed to update these rules’ weights over time. To determine the best parameter values of WRS, we employ an improved time variant Particle Swarm Optimization (PSO) algorithm with the objective of maximizing the annual net profit generated by WRS. The experiments show that our proposed WRS optimized by PSO outperforms the best moving average and trading range break-out rules.

HKU Scholars Hub
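
The abstract above describes component rules that each carry a weight updated by a reward/penalty mechanism based on profit. The sketch below is one possible reading of that idea, not the WRS definition: the signal encoding (+1 buy, -1 sell), the additive `reward` and `penalty` steps, the renormalization, and all function names are assumptions made for illustration.

```python
def ma_signal(prices, short, long):
    """Moving-average rule: +1 (buy) if the short MA is above the long MA, else -1."""
    short_ma = sum(prices[-short:]) / short
    long_ma = sum(prices[-long:]) / long
    return 1 if short_ma > long_ma else -1


def combined_signal(weights, signals):
    """Weighted vote over component rules; 0 means no position."""
    score = sum(w * s for w, s in zip(weights, signals))
    return 1 if score > 0 else (-1 if score < 0 else 0)


def update_weights(weights, recent_profits, reward=0.1, penalty=0.1):
    """Reward profitable rules, penalize losing ones, then renormalize to sum to 1.

    The additive step sizes are hypothetical; in the paper such parameters
    are what the PSO is said to tune.
    """
    updated = []
    for w, profit in zip(weights, recent_profits):
        if profit > 0:
            w += reward
        elif profit < 0:
            w = max(0.0, w - penalty)
        updated.append(w)
    total = sum(updated)
    if total == 0:
        # Every rule was penalized to zero: fall back to equal weights.
        return [1.0 / len(updated)] * len(updated)
    return [w / total for w in updated]


prices = [1, 2, 3, 4, 5, 6]
signals = [ma_signal(prices, 2, 4), ma_signal(prices[::-1], 2, 4)]
weights = update_weights([0.5, 0.5], recent_profits=[1.0, -1.0])
decision = combined_signal(weights, signals)
```

Here the first rule sees a rising series and votes to buy, the second sees a falling series and votes to sell, and the updated weights tip the combined vote toward the rule that was recently profitable.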

Cross table cubing: mining iceberg cubes from data warehouses

Author: Cheung DWL
Cho M
Pei J
Publication venue: Society for Industrial and Applied Mathematics.
Publication date: 01/01/2001

All of the existing (iceberg) cube computation algorithms assume that the data is stored in a single base table. In practice, however, a data warehouse is often organized in a schema of multiple tables, such as a star schema or a snowflake schema. In terms of both computation time and space, materializing a universal base table by joining multiple tables is often very expensive or even unaffordable in real data warehouses. In this paper, we investigate the problem of computing iceberg cubes from data warehouses. Surprisingly, our study shows that computing an iceberg cube from multiple tables directly can be even more efficient in both space and runtime than computing it from a materialized universal base table. We develop an efficient algorithm, CTC (for Cross Table Cubing), to tackle the problem. An extensive performance study on synthetic data sets demonstrates that our new approach is efficient and scalable for large data warehouses.

Crossref

Secretaría de Estado de Cultura

HKU Scholars Hub
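
For readers unfamiliar with the term used in the abstract above, an iceberg cube keeps only those group-by cells whose aggregate meets a threshold. The sketch below is a naive single-table baseline with a COUNT aggregate, shown only to make the definition concrete; it is not the CTC algorithm, which works across multiple tables. The function name and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def iceberg_cube(rows, dims, min_support):
    """Return every group-by cell whose row count is at least `min_support`.

    rows: list of tuples, one value per dimension in `dims`.
    Result maps (group-by dimensions, values) -> count.
    """
    cells = {}
    for k in range(len(dims) + 1):
        # Enumerate every subset of dimensions (every cuboid of the cube).
        for idxs in combinations(range(len(dims)), k):
            counts = Counter(tuple(row[i] for i in idxs) for row in rows)
            group_by = tuple(dims[i] for i in idxs)
            for values, count in counts.items():
                if count >= min_support:  # the iceberg condition
                    cells[(group_by, values)] = count
    return cells


sales = [("HK", "pen"), ("HK", "pen"), ("HK", "ink"), ("UK", "pen")]
cube = iceberg_cube(sales, dims=("region", "item"), min_support=2)
```

This baseline scans the table once per cuboid, so its cost grows exponentially with the number of dimensions; it also presumes the joined base table already exists, which is the cost the abstract says can be avoided.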

A Content-based search engine on medical images for telemedicine

Author: Cheung DWL
Lee CH
Ng V
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1997

Retrieving images by content and forming visual queries are important functions of an image database system. Using textual descriptions to specify queries on image content is another important component of content-based search. The authors describe a medical image database system, MIQS, which supports visual queries such as query by example and query by sketch. In addition, it supports textual queries on spatial relationships between the objects of an image. MIQS is designed as a client-server application in which the client accesses the database and its images via the WWW.

HKU Scholars Hub

An empirical study on the visual cluster validation method with Fastmap

Author: Cheung DWL
Huang Z
Ng MK
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2001

This paper presents an empirical study on the visual method for cluster validation based on the Fastmap projection. The visual cluster validation method attempts to tackle two clustering problems in data mining: verifying partitions of data created by a clustering algorithm, and identifying genuine clusters from data partitions. Both are achieved by projecting objects and clusters to 2D space with Fastmap and having humans visually examine the results. A Monte Carlo evaluation of the visual method was conducted. The validation results of the visual method were compared with the results of two internal statistical cluster validation indices; the comparison shows that the visual method is consistent with the statistical validation methods. This indicates that the visual cluster validation method is indeed effective and applicable to data mining applications.

HKU Scholars Hub

XML schema design and management for e-Government data interoperability

Author: Cheung DWL
Hon CT
Lee TY
Publication venue: United Kingdom
Publication date: 01/01/2009

Open Access Journal. This journal issue is entitled: ECEG 2009.

HKU Scholars Hub

Cross table cubing: mining iceberg cubes from data warehouses

Author: Cheung DWL
Cho M
Pei J
Publication venue: Society for Industrial and Applied Mathematics.
Publication date: 01/01/2005

HKU Scholars Hub

Evaluating probabilistic queries over uncertain matching

Author: Cheng J
Cheng R
Cheung DWL
Gong J
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

A matching between two database schemas, generated by machine learning techniques (e.g., COMA++), is often uncertain. Handling the uncertainty of schema matching has recently attracted considerable research interest, because the quality of applications relies on the matching result. We study query evaluation over an inexact schema matching, which is represented as a set of 'possible mappings', together with the probabilities that they are correct. Since the number of possible mappings can be large, evaluating queries through these mappings can be expensive. Observing that the possible mappings between two schemas often exhibit a high degree of overlap, we develop two efficient solutions. We also present a fast algorithm to compute answers with the k highest probabilities. An extensive evaluation on real schemas shows that our approaches improve the query performance by almost an order of magnitude. © 2012 IEEE.

The IEEE 28th International Conference on Data Engineering (ICDE 2012), Washington, D.C., 1-5 April 2012. In International Conference on Data Engineering Proceedings, 2012, p. 1096-110

HKU Scholars Hub
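
The abstract above concerns queries evaluated through a set of possible mappings, each with a probability, and answers ranked by probability. The sketch below shows a naive baseline under one common assumption: an answer's probability is the sum of the probabilities of the mappings that produce it. It evaluates the query once per mapping, which is the expensive approach the paper aims to improve on by exploiting overlap between mappings. The function names, schema, and data are hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_answers(mappings, evaluate, k):
    """Rank answers by total probability of the mappings that return them.

    mappings: list of (mapping, probability) pairs.
    evaluate: function taking a mapping and returning an iterable of answers.
    """
    prob = defaultdict(float)
    for mapping, p in mappings:
        # Use a set so an answer counts once per mapping, however often it occurs.
        for answer in set(evaluate(mapping)):
            prob[answer] += p
    return heapq.nlargest(k, prob.items(), key=lambda kv: kv[1])


# Hypothetical source table and three possible mappings for a target
# attribute "contact".
rows = [
    {"phone": "123", "mobile": "123", "office": "999"},
    {"phone": "456", "mobile": "789", "office": "456"},
]
possible_mappings = [
    ({"contact": "phone"}, 0.5),
    ({"contact": "mobile"}, 0.3),
    ({"contact": "office"}, 0.2),
]


def select_contact(mapping):
    """Evaluate 'SELECT contact' by rewriting it through the given mapping."""
    return [row[mapping["contact"]] for row in rows]


top2 = top_k_answers(possible_mappings, select_contact, k=2)
```

Because several mappings can send "contact" to columns that share values, the same answer accumulates probability from more than one mapping, which is why the ranking cannot be read off the single most likely mapping alone.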